Entry Name: "PKU-Chen-MC3"

VAST 2013 Challenge
Mini-Challenge 3: Visual Analytics for Network Situation Awareness

Team Members:

Siming Chen, Peking University, simingchen3@gmail.com, csm@pku.edu.cn PRIMARY (Point of contact for questions/answers)

Fabian Merkle, Universität Stuttgart, merklefn@studi.informatik.uni-stuttgart.de

Hanna Schäfer, Universität Stuttgart, schaefha@studi.informatik.uni-stuttgart.de

Hongwei Ai, Peking University, hongwei.ai@pku.edu.cn

Cong Guo, Peking University, cong.guo@pku.edu.cn

Xiaoru Yuan, Peking University, xiaoru.yuan@pku.edu.cn (Supervisor)

Thomas Ertl, Universität Stuttgart, Thomas.Ertl@vis.uni-stuttgart.de (Supervisor)

Student Team: YES

Analytic Tools Used:

AnNetTe 安－内－特, developed by the VAST collaboration team of Peking University and the University of Stuttgart, 2013

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2013 is complete?

Yes

Video:

http://vis.pku.edu.cn/vastvideo2013.wmv

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

MC3.1 – Provide a timeline (i.e., events organized in chronological order) of the notable events that occur in Big Marketing’s computer networks for the two weeks of supplied data. Use all data at your disposal to identify up to twelve events and describe them to the extent possible. Your answer should be no more than 1000 words long and may contain up to twelve images.

We present our events chronologically (the timeline and the tool are described in question 3), except for those that occur repeatedly, which are listed at their first occurrence.

1. Mon 01, 4.30pm: After the connections of regular working hours, external IP traffic diminishes and disappears around 12am. The main remaining activity is the continuous scan by the health monitor 172.10.0.6. Judging by the pattern of the following nights, this could be a network breakdown caused by an earlier event or by the unhealthy status.

2. Tue 02, 5am: After the night, the first attack starts. IPs 10.6.6.6, 10.6.6.13, 10.6.6.14, 10.7.6.3, 10.7.7.10, 10.10.6.2, 10.11.6.15, 10.16.5.15, 10.18.6.123 and 10.100.1.6 use many source ports between 10000-70000 to access destination port 80 of the main HTTP server 172.30.0.4. We classify it as a DDOS because of the high CPU load and the high connection payload. The health status worsens during the attack and recovers after the attacking IPs disappear at 7am.

3. Wed 03, 9.30am: After a regular remainder of day 2 and a regular start to day 3, IP 10.15.7.85 starts attacking port 80 of 172.20.0.15 from ports 200-70000. Some other IPs participate but never reach that extent. It is a scan because of the high CPU load and the high payload of the connections. The health status worsens during the attack and recovers after the attacking IPs disappear at 7am.

4. Sat 06, 11.30am: Following the regular pattern of day 4 and day 6, IPs 10.9.81.5 and 10.10.11.15 start attacking many destination ports on server IPs in every part of the company, using source ports from 50000-70000. IP 10.10.11.15 only attacks 172.20.0 IPs and stops after 11.45am. The lower payload, combined with the high number of destination ports and IPs, indicates a network scan. The event ends on day 7 at 3.20am and gives way to the regular night pattern. However, after 7am the system becomes very unhealthy, and the administrators shut down the network after 9am for two days to install preventive measures.

5. Week 2: The system restarts on day 10 at 6.30am and at first only allows internal IPs. When the network opens, it becomes unhealthier again. Using the IPS log, we identified three IPs, 10.13.77.49, 10.138.235.111 and 10.6.6.7, that attack at several times of day. Each event uses only some source ports from 30000-5000 and many destination ports, many of which are blocked, as shown by the connection denials logged in the IPS data. The attacks target server IPs in all parts of the company. The lower payload, combined with the high number of destination ports and the IPS deny logs, indicates a network scan. The scans are most active on day 10 from 12.30pm–5.15pm, on day 14 from 11.10am–5.20pm and on day 15 from 7.45am–9.59am.

6. Thu 11, 10.30am: After event 5, the rest of day 10 and the night before day 11 are regular. On day 11 at 10.30am the IPs 10.12.15.152 and 10.6.6.7 attack the server IPs of all company parts, using some source ports from 30000-70000 and a large number of destination ports. IP 10.6.6.7 is active from 12pm to 9.15pm and is only denied connections in the second company part. Since CPU load, IPS logs and bytes are all high, this attack probably consists of a DOS part and a scan part. The health status stays constant but improves after the attack ends on day 12 at 6.20am.

7. Thu 11, 12.15-1.00pm: During event 6 there is a hidden event: a DOS from many malicious sources that turns into a reverse port scan, in which server 172.30.0.4 connects to many external IPs from port 80 to many destination ports. The CPU load and bytes are very high, as is the destination port entropy.

8. Fri 12, 10.30am: After event 6, IPs 10.12.15.152 and 10.12.14.15 attack the servers of company parts one and two, using many source ports between 10000-70000 and destination ports 80 and 3389 (the latter is used for remote desktop services). The event is a scan with many deny logs in the IPS data, high CPU load and high connection payload, as well as low destination port numbers. The health status stays constant and only improves some hours after the attack ends on day 12 at 3.450pm.

9. Sat 13, 6am: Continuing events 6 and 8, IP 10.12.15.152 attacks together with IP 10.17.15.10, using some source ports and destination ports 0, 25, 80 and 3389 on the servers of company parts one and two. The event is a scan with many deny logs in the IPS data, high CPU load and high connection payload, as well as low destination port numbers. The health status stays constant and improves after the attack ends on day 13 at 10.45pm.

10. Sat 13, 11.20pm-1.40am: A reaction to event 9 shows in two ways. First, all internal IPs connect to the broadcast IP 239.255.255.255. Second, the IPs in 172.10.1 have a high connection rate to the 10.0.0 IPs. Both could result from a virus introduced by event 9.

11. Sun 14, 2pm-3.10pm: During event 5 on day 14 there is a short period in which many IPs (10.15.7.85, 10.12.15.152, 10.17.15.10, 10.12.14.15, 10.200.20.2, 10.156.165.120, 10.70.68.127, 10.250.178.101, 10.170.68.127, 10.179.32.181, 10.179.32.110, 10.78.100.150, 10.247.58.182, 10.247.106.27, 10.10.11.102) connect to 172.10.0.4, 172.20.0.4 and 172.30.0.4. This event is interesting because it combines many IPs from earlier attacks in one event. They use many ports in both directions and also drive the server IPs into a very unhealthy state.

12. Sun 14, 11.45pm-1.45am: This is an exception to an otherwise regular night. During that time the system has no external connections, which, according to our inquiry, was not intentional but caused by network problems. Even though it is a regular night, the health status since the attack of event 5 on day 14 is the worst of both weeks. This might indicate that the attacks of event 5 are strong enough to break down the network.

MC3.2 – Speculate on one or more narratives that describe the events on the network. Provide a list of analytic hypotheses and/or unanswered questions about the notable events. In other words, if you were to hand off your timeline to an analyst who will conduct further investigation, what confirmations and/or answers would you like to see in their report back to you? Your answer should be no more than 300 words long and may contain up to three additional images.

Hypothesis 1 concerns the malicious goals of a group of IPs. The main members of this group are 10.9.81.5, 10.10.11.15 and 10.15.7.85 in week 1 and the IPs flagged in the IPS log, 10.6.6.7, 10.138.235.111 and 10.13.77.49, in week 2. The types of attack suggest that, after succeeding in event 4, the IPs changed from DOS to scans, also using port 3389, the remote desktop service. On closer inspection, IP 10.6.6.7 shares an attack with IP 10.12.15.152, which in turn shares attacks with IPs 10.12.14.15 and 10.17.15.10. In event 11 we then see them working together in one attack, except for 10.9.81.5 and 10.10.11.15, which may have been blocked after causing the shutdown of the network.
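The chain of shared attacks in Hypothesis 1 can be checked mechanically. The following is only a minimal sketch, not part of AnNetTe: it groups IPs that appear together in at least one event, using a union-find structure. The event membership is copied from our timeline (events 3, 4, 5, 6, 8 and 9); the function name and data layout are assumptions made for illustration.

```python
def group_by_shared_events(events):
    """Group IPs that are linked through participation in common events."""
    parent = {}

    def find(ip):
        # Return the representative of ip's group (with path halving).
        parent.setdefault(ip, ip)
        while parent[ip] != ip:
            parent[ip] = parent[parent[ip]]
            ip = parent[ip]
        return ip

    for ips in events.values():
        first = find(ips[0])
        for other in ips[1:]:
            parent[find(other)] = first  # merge the two groups

    groups = {}
    for ip in parent:
        groups.setdefault(find(ip), []).append(ip)
    # Largest group first, IPs sorted as strings inside each group.
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)


# Attackers per event, as named in the MC3.1 timeline.
events = {
    3: ["10.15.7.85"],
    4: ["10.9.81.5", "10.10.11.15"],
    5: ["10.13.77.49", "10.138.235.111", "10.6.6.7"],
    6: ["10.12.15.152", "10.6.6.7"],
    8: ["10.12.15.152", "10.12.14.15"],
    9: ["10.12.15.152", "10.17.15.10"],
}
groups = group_by_shared_events(events)
for group in groups:
    print(group)
```

Under these inputs the week 2 attackers fall into a single group linked through 10.6.6.7 and 10.12.15.152, while the week 1 attackers stay separate until event 11. Co-occurrence alone does not prove coordination, so this is a starting point for the follow-up analyst rather than a confirmation.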

Hypothesis 2 concerns the development of the health status. In the first week the accumulated health status of the system is mostly stable during the night but briefly becomes very unhealthy at the start of each day, around 7am. At the same time there is a zero point in the connection entropy, which might indicate a small breakdown. In the second week those changes around 7am are a dip in health instead of a peak, and the zero point is less deep. These daily events are probably a result of what happens in Hypothesis 1, because the small breakdowns follow the DOS or the scanning. This would also explain why there are fewer breakdowns in the first part of week 2, after those malicious IPs were blocked. Unfortunately, on day 13 the trend becomes unhealthy again, which might be caused by the attackers breaking through the administrators' preventive measures.

Thirdly, we found large differences between the source and destination variables of a single connection, which should be investigated but are not yet resolved.

MC3.3 – Describe the role that your visual analytics played in enabling discovery of the notable events in MC3.1. Describe whether your visual analytics play a role in formulating the questions in MC3.2. Your answer should be no more than 300 words long and may contain up to three additional images.

To find the events for question one we used our tool AnNetTe. It consists of a timeline with overviews of IP/port entropy, CPU/byte load, and IPS and health data, plus a ring graph and a river view for the details. Together they visualize the data, connected in an interaction pipeline.
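The entropy overviews can be illustrated with a small sketch. We assume here that the entropy lines show the Shannon entropy of a field (for example, destination port) per time bin; the record layout, the function names and the 60-second bin size are hypothetical and not taken from the tool.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_per_bin(records, field, bin_seconds=60):
    """Compute the entropy of `field` for each time bin.

    records: iterable of dicts with a numeric "time" key (seconds) and the
    chosen field, e.g. "dst_port". This layout is an assumption.
    """
    bins = {}
    for rec in records:
        bins.setdefault(int(rec["time"] // bin_seconds), []).append(rec[field])
    return {b: shannon_entropy(vals) for b, vals in sorted(bins.items())}


# Synthetic records: bin 0 hits one port only, bin 1 hits four different ports.
records = [
    {"time": 5, "dst_port": 80},
    {"time": 10, "dst_port": 80},
    {"time": 70, "dst_port": 22},
    {"time": 75, "dst_port": 80},
    {"time": 80, "dst_port": 443},
    {"time": 85, "dst_port": 3389},
]
result = entropy_per_bin(records, "dst_port")
```

A bin in which all connections target one port has entropy zero, while a bin spread evenly over many ports has high entropy, which is why a port scan appears as a peak in the destination port entropy line.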

To find any anomaly, we choose a time in the timeline and play the animation of the ring view. This way we see any anomalous connection.

To find a DOS attack we can use various features. First we look at the overview timeline, where we search for peaks in the CPU load, the total bytes or the entropy of the source ports. If we find such a peak, we select it and then refine our selection in the detailed timeline of the four entropy lines. Next we look at the ring graph, where we can easily see which IP group is causing the peak. If it is not easy to find, we can exclude some groups to reduce the clutter.
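In our workflow the peaks are found by eye in the overview timeline. As a rough illustration of what such a peak search could look like if automated, the sketch below flags samples that exceed the mean by more than k standard deviations. The threshold rule, the function name and the sample values are assumptions for illustration, not the method used in the tool.

```python
from statistics import mean, pstdev


def find_peaks(series, k=2.0):
    """Return the indices whose value exceeds mean + k * standard deviation."""
    if not series:
        return []
    threshold = mean(series) + k * pstdev(series)
    return [i for i, value in enumerate(series) if value > threshold]


# Synthetic CPU load samples (percent) with a short burst in the middle.
cpu_load = [12, 15, 11, 14, 13, 95, 97, 12, 14, 13, 12, 15]
peaks = find_peaks(cpu_load)
print(peaks)
```

The same function could be applied to the total bytes or the source port entropy. A fixed global threshold is a simplification: a long attack raises the mean and can hide itself, so a rolling baseline may be more suitable on the full two weeks of data.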

If we now select that group, we get the river view of those IPs, showing all IP connections and used ports along a timeline, as well as the health and IPS attributes of the connections.

If, on the other hand, we want to find a network scan, we look at the peaks of the IPS line and then follow the same steps as for a DOS attack. For port scans we have to look closely at the destination port entropy, select its peaks and again follow the steps to confirm. To find connected story lines, it is easiest to focus on the development of the system's health over time or on the IPS logs.
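The two scan indicators named above, many destination ports and IPS deny entries, can also be combined in a simple filter. This is a hedged sketch on synthetic data: the tuple layout, the thresholds and the function name are hypothetical, and the addresses are documentation placeholders rather than IPs from the challenge data.

```python
from collections import defaultdict


def scan_candidates(flows, denied_sources, min_ports=50, min_denies=1):
    """Return sources that touch many distinct destination ports and that
    also appear in IPS deny entries.

    flows: iterable of (src_ip, dst_ip, dst_port) tuples (assumed layout).
    denied_sources: iterable of src_ip values, one per IPS deny entry.
    """
    ports = defaultdict(set)
    for src, _dst, dst_port in flows:
        ports[src].add(dst_port)

    denies = defaultdict(int)
    for src in denied_sources:
        denies[src] += 1

    return sorted(
        src for src, seen in ports.items()
        if len(seen) >= min_ports and denies[src] >= min_denies
    )


# Synthetic example: one source sweeps 200 ports, another only uses port 80.
flows = [("192.0.2.10", "198.51.100.5", port) for port in range(1, 201)]
flows += [("192.0.2.20", "198.51.100.5", 80)] * 500
denied = ["192.0.2.10"] * 30
candidates = scan_candidates(flows, denied)
print(candidates)
```

The busy source on port 80 is not reported, because a high connection count with a single destination port points to a DOS pattern rather than a scan.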